Flexible Domain Prediction using Mixed Effects Random Forests
Authors
Abstract
This paper promotes the use of random forests as versatile tools for estimating spatially disaggregated indicators in the presence of small area-specific sample sizes. Small area estimators are predominantly conceptualized within the regression setting and rely on linear mixed models to account for the hierarchical structure of the survey data. In contrast, machine learning methods offer non-linear and non-parametric alternatives, combining excellent predictive performance with a reduced risk of model misspecification. Mixed effects random forests combine the advantages of regression forests with the ability to model hierarchical dependencies. This paper provides a coherent framework based on mixed effects random forests for estimating small area averages and proposes a bootstrap estimator for assessing the uncertainty of the estimates. We illustrate our proposed methodology using Mexican income data from the state of Nuevo León. Finally, the methodology is evaluated in model-based and design-based simulations comparing it to traditional regression-based approaches for estimating small area averages.
Similar resources
Loan Default Prediction on Large Imbalanced Data Using Random Forests
In this paper, we propose an improved random forest algorithm that assigns weights to the decision trees in the forest during tree aggregation for prediction; the weights are easily calculated from out-of-bag errors in training. Experimental results show that our proposed algorithm beats the original random forest and other popular classification algorithms such as SVM, KNN and C4.5 in t...
A Prediction Model for Mild Cognitive Impairment Using Random Forests
Dementia is a geriatric disease that has emerged as a serious social and economic problem in an aging society, and early diagnosis is very important. In particular, early diagnosis and early intervention for Mild Cognitive Impairment (MCI), the preliminary stage of dementia, can reduce the onset rate of dementia. This study developed an MCI prediction model for the Korean elderly in loca...
Evaluating Random Forests for Survival Analysis using Prediction Error Curves.
Prediction error curves are increasingly used to assess and compare predictions in survival analysis. This article surveys the R package pec which provides a set of functions for efficient computation of prediction error curves. The software implements inverse probability of censoring weights to deal with right censored data and several variants of cross-validation to deal with the apparent err...
Customer churn prediction using improved balanced random forests
Churn prediction is becoming a major focus of banks in China who wish to retain customers by satisfying their needs under resource constraints. In churn prediction, an important yet challenging problem is the imbalance in the data distribution. In this paper, we propose a novel learning method, called improved balanced random forests (IBRF), and demonstrate its application to churn prediction. ...
Supervised Heterogeneous Domain Adaptation via Random Forests
Heterogeneity of features and lack of correspondence between data points of different domains are the two primary challenges while performing feature transfer. In this paper, we present a novel supervised domain adaptation algorithm (SHDA-RF) that learns the mapping between heterogeneous features of different dimensions. Our algorithm uses the shared label distributions present across the domai...
Journal
Journal title: Applied statistics
Year: 2022
ISSN: 1467-9876, 0035-9254
DOI: https://doi.org/10.1111/rssc.12600